The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform the web robot about which areas of the website should not be processed or scanned. Robots are often used by search engines to categorize web sites. Not all robots cooperate with the standard; email harvesters, spambots, and malware robots that scan for security vulnerabilities may even start with the portions of the website that they have been told to stay out of. The standard is different from, but can be used in conjunction with, Sitemaps, a robot ''inclusion'' standard for websites.

==History==
The standard was proposed by Martijn Koster when working for Nexor in February 1994 on the ''www-talk'' mailing list, the main communication channel for WWW-related activities at the time. Charles Stross claims to have provoked Koster to suggest robots.txt after he wrote a badly behaved web crawler that caused an inadvertent denial-of-service attack on Koster's server. It quickly became a de facto standard that present and future web crawlers were expected to follow; most complied, including those operated by search engines such as WebCrawler, Lycos and AltaVista.
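A crawler that cooperates with the standard fetches the site's /robots.txt file before crawling and skips the areas it disallows. The following minimal sketch, which is illustrative rather than part of the standard itself, uses Python's standard-library urllib.robotparser to perform such a check; the user-agent name "ExampleBot" and the example.com URLs are placeholders.

<syntaxhighlight lang="python">
# Illustrative sketch: check a site's robots.txt before crawling a URL.
# "ExampleBot" and the example.com addresses are placeholder values.
import urllib.robotparser

parser = urllib.robotparser.RobotFileParser()
parser.set_url("https://www.example.com/robots.txt")  # well-known location defined by the standard
parser.read()  # download and parse the robots.txt file

# can_fetch() applies the Allow/Disallow rules that match the named user agent
url = "https://www.example.com/private/page.html"
if parser.can_fetch("ExampleBot", url):
    print("Crawling permitted by robots.txt:", url)
else:
    print("Crawling disallowed by robots.txt:", url)
</syntaxhighlight>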